Euphoria: A fast and simple interpreted language

By Alexander Toresson

Hello folks! Today I'm gonna introduce a rather unknown programming language to you, the Euphoria programming language. It is one of the fastest interpreted programming languages around (measured by the prime sieve benchmark), maybe even the fastest, even though it features subscript checking, type checking, advanced memory structures and more. Seems impossible? Make your own judgement!

Euphoria runs on Windows, DOS, Linux and FreeBSD. Those advanced memory structures are so-called 'sequences', and are one of the key features in Euphoria. They are extremely dynamic and can be used to store almost anything. A sequence holds an ordered list of items, each of which is either a number or another sequence, nested to any depth.

It's a little hard to explain, so here's an example:

{1, 2, 3, 4}
is a sequence with four numbers

{1, 2, {1.2, 1.3}, 5.5, 6}
is a sequence with: two numbers, a sequence with two numbers, and two numbers. Note the mix of integers and floats.

The structure of them can be changed at runtime:

a = {1, 5, 3}
a[2] = {4, 6} -- a[4] would result in an error,
              -- because of subscript-checking.
? a           -- prints {1, {4, 6}, 3}

If you're familiar with C, the subscript syntax should look familiar, except that Euphoria counts from 1. Being able to reshape a sequence at runtime like this makes it a very powerful type.

How do you create a big sequence without writing every item out, or a sequence whose size isn't known until the program runs? Here's the way:

a = repeat(0, 10)           -- a contains {0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
b = repeat({1, 2}, 3)       -- b contains {{1, 2}, {1, 2}, {1, 2}}
c = repeat(repeat(0, 2), 2) -- c contains {{0, 0}, {0, 0}}

Note that the sequence c contains the equivalent to a 2x2 array, matrix, or whatever.
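As a small sketch of how such a matrix might be built and used when its size is only known at runtime (the names n and grid are made up for this example, and n = 3 stands in for a value computed elsewhere):

```euphoria
integer n
sequence grid

n = 3                           -- stand-in for a size computed at runtime
grid = repeat(repeat(0, n), n)  -- n rows, each holding n zeros
grid[2][3] = 7                  -- row 2, column 3; subscripts start at 1
? grid                          -- grid is now {{0,0,0},{0,0,7},{0,0,0}}
? length(grid)                  -- 3, the number of rows
? length(grid[1])               -- 3, the number of items in the first row
```

Only row 2 changes: each row made by repeat() behaves as its own copy, so assigning into one row leaves the others alone.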

How do you then declare a variable? It's done in this way:

object a
sequence b
atom c
integer d

This declares an object with the name a, a sequence with the name b etc... The term object has nothing to do with object-orientated programming, it's just a type which can contain either a sequence or an atom.

If I haven't mentioned it, Euphoria is not an OOP language, it achieves the same things in a better and simpler way.

An atom is a type which can contain either a 32-bit integer or a 64-bit float. The name 'atom' signifies that it can only contain one value.

An 'integer' can contain a 31-bit integer (Euphoria reserves one bit for its own use).

Note that I say *can* contain. Euphoria has got initialization-checking, so writing:

integer a, b
a = b

would result in an error, because b doesn't contain anything yet.

After reading this, you may ask how strings are stored. The answer is simple. As sequences. For example, "ABCDEFG" is stored as {65, 66, 67, 68, 69, 70, 71}.
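Because a string is an ordinary sequence, everything that works on sequences works on strings too. A minimal sketch (the variable name s is just for illustration):

```euphoria
sequence s

s = "ABCDEFG"
? s[1]                            -- 65, the character code of 'A'
? compare(s[1..3], {65, 66, 67})  -- 0, meaning the two are identical
s[1] = 'a'                        -- 'a' is just another way to write 97
puts(1, s & '\n')                 -- writes aBCDEFG to the screen
? length(s)                       -- 7
```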

Now for typechecking:

integer a
atom b

b = 1.2
a = b

will give a typecheck error.
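The same checking applies to types you define yourself with the type keyword. Here is a rough sketch; the type name hour and its range are invented for this example:

```euphoria
-- 'hour' is a hypothetical type made up for this example
type hour(integer x)
    return x >= 0 and x <= 23   -- true (1) if x is a valid hour
end type

hour h

h = 10          -- accepted: 10 passes the check
h = h + 13      -- accepted: 23 still passes
-- h = h + 1    -- uncommenting this stops the program with a type-check failure
? h             -- 23
```

The check runs on every assignment to h, so a bad value is caught where it is stored rather than somewhere further down the line.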

The syntax of Euphoria is very simple. The following keywords are reserved:

    and            end             include          to
    by             exit            not              type
    constant       for             or               while
    do             function        procedure        with
    else           global          return           without
    elsif          if              then             xor

Most of them work the way their counterparts do in Visual Basic, even where the names differ. Some things are borrowed from C, though, such as the include and return statements.

Here's a piece of sample code:

 sequence list, sorted_list             -- declare local variables

 function merge_sort(sequence x)
 -- put x into ascending order using a recursive merge sort
     integer n, mid                      -- declare private variables
     sequence merged, a, b

     n = length(x)                     -- get the 'length' of the input 
     if n = 0 or n = 1 then            -- sequence, ie the number of
         return x                      -- items in the 'root' of the
     end if                            -- sequence.

     mid = floor(n/2)                -- divide by 2 and round down
     a = merge_sort(x[1..mid])       -- sort first half of x
     b = merge_sort(x[mid+1..n])     -- sort second half of x

     -- merge the two sorted halves into one
     merged = {}                               -- empty sequence
     -- continue while both a and b contain at least one item
     while length(a) > 0 and length(b) > 0 do
         -- this is the way sequences are compared in Euphoria
         -- you cannot use a[1] < b[1], see below for reason
         if compare(a[1], b[1]) < 0 then
             -- add a[1] at the end of merged
             merged = append(merged, a[1])
             -- get all except the first item
             a = a[2..length(a)]
         else
             merged = append(merged, b[1])
             b = b[2..length(b)]
         end if
     end while
     return merged & a & b  -- merged data plus leftovers
 end function        -- a function always returns something, ie not void

 procedure print_sorted_list()
 -- generate sorted_list from list
     list = {9, 10, 3, 1, 4, 5, 8, 7, 6, 2}
     sorted_list = merge_sort(list)
     -- note: merge_sort returns a sequence without any problem
     -- blessed be the C programmers
     ? sorted_list
 end procedure       -- a procedure doesn't return anything
                     -- it is the equivalent of a void function
 print_sorted_list()     -- this command starts the program

This is a program which can sort any sequence you give it. It can sort {1.5, -9, 1e6, 100} or {"oranges", "apples", "bananas"} just as easily as it sorts the default sequence. All with the same code.

Now you may want to know why the heck one can't use >, <, = etc. directly between sequences. Well, you can, but it won't give the result you'd want in an if statement. For example, {1, 1, 0} = {0, 1, 1} returns {0, 1, 0}! That's because Euphoria applies operators to each pair of corresponding items in the sequences, which is why there's a separate function, compare(), that compares one whole sequence to another. You can use this feature to your advantage:

a = repeat(2, 100)
b = repeat(1, 100)
c = a + b -- a hundred pairs of numbers added together
          -- in one instruction!
          -- ie c contains a hundred 3's
c += 1    -- all numbers in c got increased by one
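To make the difference between item-by-item operators and compare() concrete, here is a short sketch (the variable names x and y are made up for illustration):

```euphoria
sequence x, y

x = {1, 1, 0}
y = {0, 1, 1}

? x = y              -- {0,1,0}: one true/false result per pair of items
? compare(x, y)      -- 1: x counts as "greater" because 1 > 0 in the first item

-- an if statement needs a single true/false value, so use compare()
if compare(x, y) = 0 then
    puts(1, "same\n")
else
    puts(1, "different\n")   -- this branch runs
end if
```

compare() returns -1, 0 or 1, so the same call serves for "less than", "equal" and "greater than" tests.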

The observant reader may wonder what the difference is between the function append() and the operator &. (prepend() also exists, of course.) The difference is that append() adds an object as the last item of a sequence, while & concatenates two objects. When joining a sequence and an atom, the following two lines do exactly the same thing:

a = append({1, 2}, 3) -- gives {1, 2, 3}
b = {1, 2} & 3 -- also gives {1, 2, 3}

However, when giving them two sequences, the result is different:

a = append({1, 2}, {3}) -- adds {3} as last item, giving {1, 2, {3}}
b = {1, 2} & {3} -- puts together {1, 2} and {3}, giving {1, 2, 3}
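The same pattern holds for prepend() and for longer sequences. A quick sketch, with the resulting values noted in the comments:

```euphoria
? prepend({2, 3}, 1)               -- {1,2,3}
? prepend({2, 3}, {1})             -- {{1},2,3}: {1} becomes the first item
? 1 & {2, 3}                       -- {1,2,3}

-- the length tells the two apart at a glance
? length(append({1, 2}, {3, 4}))   -- 3: the whole of {3,4} is one item
? length({1, 2} & {3, 4})          -- 4: the items are joined end to end
```

A rough rule of thumb: append() and prepend() always make the sequence exactly one item longer, while & makes it longer by however many items the other side holds.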

You may also have noticed that some variables were local and some were private. Private ones are declared inside a function or procedure, and can only be accessed from that function or procedure. Local variables are declared outside any function, and can be accessed from the rest of the file. And then there are global variables, which are declared like local ones, but with 'global' added before the type in the declaration, and can be accessed from any file.
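Here is a small sketch of the three scopes side by side. All the names (counter, limit, bump, step) are invented for this example:

```euphoria
global integer counter      -- global: visible to any file that includes this one
integer limit               -- local: visible to the rest of this file only

counter = 0
limit = 3

procedure bump()
    integer step            -- private: exists only inside bump()
    step = 1
    if counter < limit then
        counter = counter + step
    end if
end procedure

bump()
? counter                   -- 1
-- ? step                   -- would be an error: step is not known out here
```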

So, why use Euphoria?

* Simple syntax, subscript checking, type checking, initialization checking, always meaningful errors, flexible types => stable code, easy coding & easy debugging

* Not edit/compile/link/run, but just edit/run

* Speed like no other interpreted language

* Your program will almost never fail with an exception. Even if you use the procedure poke() and accidentally try to change something you're not allowed to change, Euphoria will most often catch you and report a meaningful error. Dunno if you C programmers know of that procedure, but the problem is the equivalent of having a wild pointer.

* Portable between Windows, DOS, Linux, FreeBSD, and soon MacOS X.

There's one site on the net that offers space for uploading Euphoria programs, and that's the official homepage. It also features a highly active forum/mailing list. Euphoria has been around since '93, and that makes that site a big archive with almost everything you'd want, if it ain't too exotic. Almost everything is open source. There's a Win32 GUI library (win32lib), a GTK library, several OpenGL libraries, and a lot more. Euphoria lives on user contributions.

Euphoria comes in one free version and one commercial version. The free version has some limitations, mainly no compilation and no tracing of programs with > 300 lines. The commercial version costs only around 25 bucks.

Want more speed? There's an assembler made for use within a Euphoria program, which makes it easy to have some part of your program written in asm. There's also an eu-to-c translator, which makes Euphoria programs run almost as fast as C. The free version of it just displays a message for some seconds at start of execution, otherwise there's no difference between the commercial and the free version of it.

This article is just a peek at the surface. Go to www.rapideuphoria.com for more info. If you got interested, use that site as your reference rather than me.

Alexander Toresson
